A Comparative Study of Parameter Estimation Methods for Statistical Natural Language Processing

Authors

  • Jianfeng Gao
  • Galen Andrew
  • Mark Johnson
  • Kristina Toutanova

Abstract

This paper presents a comparative study of five parameter estimation algorithms on four NLP tasks. Three of the five algorithms are well-known in the computational linguistics community: Maximum Entropy (ME) estimation with L2 regularization, the Averaged Perceptron (AP), and Boosting. We also investigate ME estimation with L1 regularization using a novel optimization algorithm, and BLasso, which is a version of Boosting with Lasso (L1) regularization. We first investigate all of our estimators on two re-ranking tasks: a parse selection task and a language model (LM) adaptation task. Then we apply the best of these estimators to two additional tasks involving conditional sequence models: a Conditional Markov Model (CMM) for part of speech tagging and a Conditional Random Field (CRF) for Chinese word segmentation. Our experiments show that across tasks, three of the estimators — ME estimation with L1 or L2 regularization, and AP — are in a near statistical tie for first place.
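
To make the re-ranking setting concrete, here is a minimal sketch of one of the five estimators named above, the Averaged Perceptron, applied to choosing the best candidate from a list. It is an illustration under stated assumptions, not the authors' implementation: the data layout (a list of candidate feature vectors plus the index of the reference candidate), the function names `train_averaged_perceptron` and `rerank`, and the default of 5 epochs are all hypothetical.

```python
def dot(w, f):
    """Inner product of a weight vector and a feature vector."""
    return sum(wi * fi for wi, fi in zip(w, f))


def rerank(candidates, w):
    """Return the index of the highest-scoring candidate (first one on ties)."""
    scores = [dot(w, f) for f in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


def train_averaged_perceptron(examples, num_features, epochs=5):
    """Train a re-ranker with the averaged perceptron.

    examples: list of (candidates, gold) pairs, where candidates is a list
    of feature vectors and gold is the index of the reference candidate.
    This layout is an assumption made for the sketch.
    """
    w = [0.0] * num_features      # current weights
    w_sum = [0.0] * num_features  # running sum of weights after each example
    steps = 0
    for _ in range(epochs):
        for candidates, gold in examples:
            predicted = rerank(candidates, w)
            if predicted != gold:
                # Perceptron update: move toward the reference candidate
                # and away from the wrongly preferred one.
                for j in range(num_features):
                    w[j] += candidates[gold][j] - candidates[predicted][j]
            for j in range(num_features):
                w_sum[j] += w[j]
            steps += 1
    # Averaging over all steps is what distinguishes the averaged
    # perceptron from the plain perceptron.
    return [s / steps for s in w_sum]


# Toy usage: the reference candidate is always the one with feature 0 set.
toy = [([[1.0, 0.0], [0.0, 1.0]], 0), ([[0.0, 1.0], [1.0, 0.0]], 1)]
weights = train_averaged_perceptron(toy, num_features=2)
best = rerank(toy[1][0], weights)
```

Returning the average of the weight vectors seen during training, rather than the final vector, is the step that gives the method its name; the regularized ME estimators in the study instead optimize a penalized likelihood objective and are not shown here.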

Similar resources

Evaluation of estimation methods for parameters of the probability functions in tree diameter distribution modeling

One of the most commonly used statistical models for characterizing the variation of tree diameter at breast height is the Weibull distribution. The usual approach for estimating the parameters of a statistical model is maximum likelihood estimation (the likelihood method), which typically relies on iterative algorithms such as Newton-Raphson. However, the efficiency of the likelihood method is not...

A New Correlation Based on Multi-Gene Genetic Programming for Predicting the Sweet Natural Gas Compressibility Factor

Gas compressibility factor (z-factor) is an important parameter widely applied in petroleum and chemical engineering. Experimental measurements, equations of state (EOSs) and empirical correlations are the most common sources in z-factor calculations. However, these methods have serious limitations such as being time-consuming as well as those from a computational point of view, like instabilit...

Assessment of Empirical Methods of Runoff Estimation by Statistical Test (Case Study: BanadakSadat Watershed, Yazd Province)

Estimating the runoff that results from precipitation underlies many studies in the design of water resource development and exploitation projects, yet measuring and calculating it has always been problematic because of environmental constraints. Given the importance of estimating outlet runoff and flood volume for the country's integrated watershed management, this study tried to 9...

Publication date: 2007